Homebrew Handbook for SNES
By Grog
(C) 1999 Realtime Simulations and Roleplaying games, inc.
Introduction
This doc is not intended for newbies to ASM programming. If you don't know the difference between a direct and an indirect memory access, I advise you to go learn the basics of ASM before you jump into SNES programming. I'm not going to make recommendations about the best ASM tutorial; I will say that learning ASM is not at all difficult. The difficulty is in developing large applications with such rudimentary tools. This doc is intended to provide an overall view of SNES programming and to provide some code that demonstrates the basic tasks required to make your SNES actually do something.
ASM programming is only one piece of the puzzle, of course. Somehow you have to create graphics and sound; you'll need lots of tools to write a complete SNES game.
| Tool Needed | Recommended Freeware Program |
| 65816 Assembler | X816 |
| SNES Emulator w/Debugger | ZSNES, SNES9X, RSRSNES |
| SNES Documentation | Try www.classicgaming.com/EPR |
| SNES Emu Source | SNES9X, SNEESE |
| Tile editor | TLAYER |
| Map editor | Sorry, write your own. ;) |
| Sound sample Converter | WAV2BRR |
| Music Converter | MIDI2SPC |
| Text Editor | Microsoft Edit |
| Raw Data Editor | Your favorite hex editor |
| Misc. Utilities | Write simple QBasic or VBScript programs |
| Integrated Development Environment | SNIDE (hehe, someday); there is one other whose name I can't remember |
Of course, my recommendations are my opinions, and plenty of people would flame me for, say, recommending SNES9X as an emu or tlayer for tile editing. They're welcome to use whatever they want.
The uses for many of these tools are clear; obviously you'll need a tile editor. Some of the others might raise some eyebrows though. The SNES Emu Source is one big one; few people realize just how much trial and error goes into an emulator; looking at the source for a working emu is an easy way to fill the holes in the available SNES documentation. The authors of these programs have already done the difficult work of examining the use of every register and memory location in the SNES. Use their work! And be sure to credit them!
The "Misc. Utilities" category is pretty broad. Some things I've written that fall into this category are: tlayer .pal to SNES palette converters, raw data generators (like sin/cos data for sprite animation), data compressors (with corresponding decompressors in SNES ASM), blank .BIN generators (for later use with tlayer), batch files for compiling, and text script managers (for linking sprites to dialog).
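As a rough illustration of two of these utilities, here is a minimal sketch of a palette converter and a sine table generator. It is written in Python rather than QBasic or VBScript, and it assumes the input palette is a flat run of 8-bit R, G, B triplets; the real tlayer .pal layout is not described in this doc, so check that before relying on it. The function names are made up for the example. The output format is the standard 15-bit SNES color word (0bbbbbgggggrrrrr, low byte first).

```python
import math
import struct

def rgb_to_snes(r: int, g: int, b: int) -> int:
    """Pack 8-bit R, G, B into one 15-bit SNES color (0bbbbbgggggrrrrr)."""
    return (r >> 3) | ((g >> 3) << 5) | ((b >> 3) << 10)

def convert_palette(rgb_bytes: bytes) -> bytes:
    """Convert R,G,B triplets (assumed input layout) to little-endian SNES words."""
    if len(rgb_bytes) % 3 != 0:
        raise ValueError("input length must be a multiple of 3")
    out = bytearray()
    for i in range(0, len(rgb_bytes), 3):
        r, g, b = rgb_bytes[i:i + 3]
        out += struct.pack("<H", rgb_to_snes(r, g, b))
    return bytes(out)

def sine_table(entries: int = 256, amplitude: int = 127) -> bytes:
    """Return one full sine period as signed 8-bit values (two's complement)."""
    values = []
    for i in range(entries):
        s = round(amplitude * math.sin(2 * math.pi * i / entries))
        values.append(s & 0xFF)  # store negative values as two's complement bytes
    return bytes(values)
```

Either result can be written straight to a .BIN file and included in the ROM as raw data; a cosine lookup can reuse the same table by starting a quarter period (64 entries) later.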
The Map editor is a key component of any SNES development effort. Tiles only take you so far by themselves; you've got to arrange them into maps and find ways to compress the maps that don't take too long to decompress on a crappy 3.58 MHz 65816. ;) I have found that it is often best to write map editors as SNES programs. Embed the tiles for the game into a small ROM with basic map editing capabilities. This allows you to see exactly what your map will look like in the real game, even with full animation effects. This also allows debugging of map compression and animation. The most important reason to use SNES ROMs for map editing is to ensure that the map can be decompressed and drawn fast enough to meet your frames-per-second goal, typically 20fps or 30fps for background animation. A completed map can be saved into the 8k SRAM, or simply saved in a ZST save state and extracted by another tool.
The Big Picture
Programming a game for SNES is a time consuming task that requires patience and good planning. Without a plan of attack, you've got no hope of ever writing anything better than a scrolling demo. Even the simplest task can be overwhelming in assembly language; the "top down" approach is probably the best way to develop your main program. First, write out a list of everything that the game engine needs to do. This can be done in conjunction with game design, where you lay down story line, game-play, and graphics-- but don't get carried away with the fun stuff until you've got a solid understanding of what can and cannot be done technically. The overall class of game is your first concern: platformer, shooter, arcade "beat the crap outta opponent" fight game, RPG, action/adventure, etc. Each one of these has its own quirks and program needs, but there are several key pieces common to any complex game:
Initialization
Intro/title screen
Map Decompression and rendering
Main Character rendering
Sprite rendering and AI
User input
Sound/Music
Text output, menus, etc
Ending credits
By far the hardest of these is the map decompression and rendering engine. The SNES has only 128k of RAM to work with-- not much space at all when you consider a full page map with dual backgrounds and 8x8 tiles takes up 4k of RAM just for the map data. Add in animation and "object" data and your RAM-per-page runs up quickly. Clearly RAM limitations require very clever ways of drawing maps. ROM space is also relatively limited; 4 megabytes don't go far with uncompressed world maps and tile graphics. These issues are covered in more detail later in other docs. My own SNES programming is currently centered around developing efficient map drawing routines. Any suggestions would be appreciated. ;)
In the SNES, the tasks listed above can be done at several different times:
As a background process constantly running
As a VBlank interrupt handler (NMI 60 times per second)
As an IRQ interrupt handler (scanline counter)
The VBlank (and HBlank if you're careful) interval is the only time you should access the VRAM when the screen is enabled. The VBlank period is limited to about 8,125 machine cycles (less when you account for the 33% delay of slowrom accesses). Another VBlank interrupt will not be processed until you return from the current handler; therefore you should keep your VBlank handler short enough to finish within one frame, or the game will appear to "slow down" (this is what happens in Zelda for NES when there are too many sprites on the screen).
One way to ensure there is no slowdown is to perform the most time-consuming tasks in the background; in my own programs that is when I set up the next frame of graphics and run sprite animation and AI tasks. The VBlank handler should be quick and do little more than page-flip to the next frame and check for button presses and sound status.
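To get a feel for the budget, here is a back-of-the-envelope sketch in Python using only the figures quoted in this doc: the 8,125 cycle VBlank, the 33% slowrom delay, and the 60 per second NMI. The model is an assumption made for illustration: it treats a slow access as costing one third more time than a fast one and lets you guess what fraction of your handler's cycles hit slow memory. Real timing depends on the exact instruction mix, so treat the results as rough planning numbers.

```python
NMI_RATE_HZ = 60          # VBlank NMIs per second, as stated above
VBLANK_CYCLES = 8125      # VBlank budget quoted above
SLOWROM_PENALTY = 1 / 3   # the 33% slowrom delay quoted above

def vblanks_per_frame(target_fps: int) -> int:
    """How many VBlank periods pass for each rendered frame at target_fps."""
    return NMI_RATE_HZ // target_fps

def effective_cycles(slow_fraction: float) -> int:
    """Usable cycles if slow_fraction (0.0 to 1.0) of all cycles hit slowrom.

    Assumed model: each slow cycle takes (1 + SLOWROM_PENALTY) times as long.
    """
    return int(VBLANK_CYCLES / (1 + SLOWROM_PENALTY * slow_fraction))

print(vblanks_per_frame(30), vblanks_per_frame(20))
print(effective_cycles(0.0), effective_cycles(1.0))
```

Under these assumptions a 30fps goal leaves two VBlank periods per frame and a 20fps goal leaves three, while a handler running entirely from slowrom keeps only about three quarters of the quoted cycle count.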
The IRQ handler can be used for special effects like "split screen" operation or to simply give another timing interrupt. The HBlank interval is limited to a paltry 56 cycles, and your timing must be perfect to send your data in that interval. Many emulators are not capable of timing that precise, so HBlank access to VRAM is not recommended.
Of course, these are just my impressions of "the way it should work"; other programmers have other ways to do things.
This is about as specific an overview as I can provide without actually picking a class of game for the discussion. So, I choose RPG. What else would I pick, right? :)
With our target game type in mind, we can now begin to write specifications for our game engine. My impression of the ideal RPG is Chrono Trigger, but we'll start simpler and design a Final Fantasy 1 style RPG engine, except with SNES quality graphics and sound. This means I can make the following assumptions:
Now we can begin writing pseudocode to lay out our game. Some people (mostly CS professors with no real world experience) would say to make a flowchart, but flowcharts simply take too long and are silly. Here's the pseudocode for the Map mode's NMI:
Note that my idea of Pseudocode is BASIC. That is not an accident; the above code is actually valid in SNIDE Basic. The process of writing a game in ASM is much like what a BASIC compiler would do with the above program. We have to fill in each subroutine, just like a BASIC programmer would. The difference, of course, is that the above program is a tad bit more complicated in ASM:
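MainMapNMI:
    php                     ;Preserve the flags register
    sep #$30                ;Switch to 8-bit accumulator/memory and index registers
    .mem 8
    .index 8                ;Allow the subroutines to assume 8-bit mem/index size
    jsr ReadJoypads
    jsr HandleButtons
    jsr UpdateSprites
    jsr UpdateMainCharacter
    jsr RenderBackground1
    jsr RenderBackground2
    jsr RenderMenus
    lda $4210               ;clear NMI flag so more VBlanks can be processed
    plp                     ;Restore the flags register
    rts                     ;Assumes a jsr from the real NMI vector handler;
                            ;a routine installed directly in the vector must end with rti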
Another thing that should be clear is that programs should use subroutines wherever possible. It's a whole lot easier to read the above code and look up specific subroutines as needed than it is to wade through the same program written as one big block of code. Additionally, subroutines allow you to re-use code, saving ROM space and making it much easier to fix bugs. After all, if you have a bug in a piece of code you use 2000 times, you'd have to fix it 2000 times if it wasn't in a subroutine.
The Little Picture
It should be obvious by now that a whole SNES program will be huge. Writing a game is a pretty daunting task even for an experienced programmer. It helps to have a library of pre-written, pre-tested code that you can simply plug together to do what you want. My advice to anyone writing for SNES (or in assembly language in general) is to first work up a library of useful subroutines that you can use in your code. You then implement each subroutine called for in your main program using the library blocks to do most of the work. This is called "Bottom up" programming. So, to write a game, first use Top Down design to lay out the big picture, then use Bottom Up programming to implement each piece of the big picture.
Library functions are easy to use when they are properly documented. For example, a subroutine that loads a Palette into the SNES color registers should have comments that describe the parameters and return values of the subroutine:
;============================================================================
; LoadPalette -- Load palette data into PPU Color RAM
;----------------------------------------------------------------------------
; In: DS:X -- points to the palette data (in standard SNES color format)
;     Y    -- Number of colors in palette - 1 (0 to 255)
;     A    -- First color index to write to (0 to 255)
;----------------------------------------------------------------------------
; Out: None
;----------------------------------------------------------------------------
; Modifies: none
;----------------------------------------------------------------------------
LoadPalette:
    pha
    phx
    phy
    php             ;Preserve registers
    rep #$30
    sep #$20        ;8-bit accumulator, 16-bit index registers
    .mem 8
    .index 16
    iny             ;Increment Y (so Y=count instead of count-1)
    sta $2121       ;Store the first color index into CGADDR
                    ;NOTE: 0,X below must assemble as absolute,X (not direct
                    ;page,X), otherwise the data bank (DS) is not used
-
    lda 0,X         ;Load the first byte of the color
    sta $2122       ;store it into CGDATA
    inx             ;increment input pointer
    lda 0,X         ;Load the second byte of the color
    sta $2122       ;store it into CGDATA
    inx             ;increment input pointer
    dey             ;Decrement color count
    bne -           ;If color count is not zero (zero flag clear), continue loop
    plp             ;Restore Registers
    ply
    plx
    pla
    rts
;============================================================================
Now, this example shows how I comment ALL of my code. Every subroutine has a comment block that describes its input and output parameters. It also describes which registers will be modified by the subroutine. In this example, no registers are modified (because I preserve them on the stack during the subroutine). When a program starts getting huge, these header comments are invaluable. For example, say my LoadPalette DIDN'T preserve the X or Y registers. If I didn't have that marked in the header, I'd never know it unless I went through every line of code in the subroutine to look for it. Even then I could miss something. With thousands of lines of code in a program, it's easy to forget what subroutines use what registers, so document it! I normally wouldn't presume (or waste time) teaching good programming style, since it's none of my business, but I can tell you now that if you don't do this, you will almost certainly wind up wishing you did. Poor commenting is one of the key reasons assembly language is considered "hard".
This example also shows a more subtle problem with assembly language. In most cases when loading a palette there is no need to make the subroutine terribly fast. However, in many cases the time wasted pushing and pulling is large enough that it is truly wasteful. Notice that my pushes and pops on the stack seem to take up most of the subroutine. This is true from a memory usage standpoint, but as long as the number of colors copied is greater than about 8, the loop will dominate execution time. If only a small number of colors needs to be set, it would be (much) faster to write just those colors directly into CGRAM. Decisions like these represent the Time/Space/Simplicity tradeoff. Typically, you can make code faster by using more commands or more memory. If you need to reduce memory usage, typically you must sacrifice speed. To minimize either time or space, typically you produce more complex code, meaning it is harder to debug and modify. In the case of LoadPalette, I could remove the push/pop pairs and make it faster, but then LoadPalette is more complex to use, because the caller must worry about registers being modified.
Nowhere is the Time/Space/Simplicity tradeoff more evident than in graphics compression. To store an uncompressed 256x256 pixel image with 256 colors, you will use 64k of ROM space. That is a heck of a lot of space (for a SNES), allowing at most (with a 32mbit ROM) 64 full screen images on the SNES. Not since the Atari 2600 days has 64 screens of game data seemed big. However, uncompressed image data can be loaded into video memory moderately quickly, and the code to load it is about four lines long. Hence, complexity is minimized, speed is mediocre, but space is basically maximized.
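The arithmetic behind those figures can be checked with a few lines of Python. This is only a worked version of the numbers in the paragraph above; the function names are made up for the example.

```python
def image_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of an uncompressed image in bytes."""
    return width * height * bytes_per_pixel

def images_per_rom(rom_megabits: int, image_size: int) -> int:
    """How many whole images of image_size bytes fit in a ROM of rom_megabits."""
    rom_bytes = rom_megabits * 1024 * 1024 // 8
    return rom_bytes // image_size

# 256 colors means one byte per pixel.
full_screen = image_bytes(256, 256, 1)
print(full_screen // 1024, "k per image")
print(images_per_rom(32, full_screen), "images in a 32mbit ROM")
```

Note that this leaves no room for code, sound, or anything else, so the real number of screens would be lower still.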
In the SNES, the graphics are divided into 8x8 or 16x16 pixel blocks called Tiles. To construct a full screen image using tiles, first the tile data (typically 16-256 tiles) is loaded into VRAM. With 256 tiles in 256 color 8x8 mode, that's 16k of data. Then, a Map is loaded into VRAM. Each word (16 bits) of the map indicates a Tile number, palette, and orientation (flip tile vertically or horizontally). The Map organizes Tiles into the desired images. Often, a small number of tiles make up most of the image. Thus, fewer tiles are needed. Additionally, the tiles are usually common to many different Maps, meaning once the tiles are loaded, they need not be loaded again. In our example here, we're using 8x8 tiles, so a 256x256 screen is divided into 32x32 positions in the Map. Thus, the Map's size is 32x32 words, or 2kbytes of memory. So, to load a full screen image, we must store 18k of data in ROM, and load two chunks of memory. The copy routine is only twice as complex as in the Uncompressed case, but space is reduced by almost 75%! Equally important, since 46k less data needs to be loaded into VRAM, the routine is almost 75% faster as well.
Now, the roughly 70% space savings offered by Tiles helps a lot, but in very, very large games (i.e., Chrono Trigger sized) there is still too much data and too little space. In these cases we must Compress the raw tile and map data. There are many, many types of compression, some easier than others to implement on SNES, but they share several common characteristics: the code is more complex, time is wasted decompressing the data, and RAM is wasted to hold decompressed data. All three of these problems make Compression an advanced topic. Someday when I'm bored I'll delve into the murky depths of RLE, Huffman, LZ and its variants, and other frightening compression methods.
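Here is a small Python sketch of the map word and the size comparison, the sort of thing a map editor or converter tool would do on the PC side. The bit layout used is the standard SNES background map entry: tile number in bits 0-9, palette in bits 10-12, horizontal flip in bit 14, vertical flip in bit 15. Bit 13 is a priority flag that the text above does not mention. The function and constant names are made up for the example.

```python
def pack_map_entry(tile: int, palette: int = 0, hflip: bool = False,
                   vflip: bool = False, priority: bool = False) -> int:
    """Build one 16-bit map word from its fields."""
    if not 0 <= tile <= 0x3FF:
        raise ValueError("tile number must fit in 10 bits")
    if not 0 <= palette <= 7:
        raise ValueError("palette number must fit in 3 bits")
    word = tile | (palette << 10)
    if priority:
        word |= 0x2000
    if hflip:
        word |= 0x4000
    if vflip:
        word |= 0x8000
    return word

# Sizes for the 256 color, 8x8 tile example in the text.
TILE_BYTES = 8 * 8 * 1                 # one byte per pixel
TILESET_BYTES = 256 * TILE_BYTES       # 256 tiles
MAP_BYTES = 32 * 32 * 2                # 32x32 positions, one word each
BITMAP_BYTES = 256 * 256 * 1           # the uncompressed image

tiled_total = TILESET_BYTES + MAP_BYTES
saving = 1 - tiled_total / BITMAP_BYTES
print(tiled_total // 1024, "k tiled,", round(saving * 100), "% saved")
```

The exact saving works out to a little under 72%, which is where the "almost 75%" figure comes from. The saving grows when several maps share one tile set, since the tile data is then stored only once.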
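To give a taste of the simplest of these, here is a minimal run-length (RLE) sketch in Python for the PC-side compressor. The format is a made-up one chosen for illustration: the data is stored as (count, value) byte pairs with count from 1 to 255. It is not the format of any particular game or tool. The decompressor is included so the round trip can be checked; on the SNES the matching decompressor would be written in ASM.

```python
def rle_compress(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs, count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the next byte matches and the count fits in a byte.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        out += bytes([run, value])
        i += run
    return bytes(out)

def rle_decompress(packed: bytes) -> bytes:
    """Expand (count, value) byte pairs back into the original data."""
    if len(packed) % 2 != 0:
        raise ValueError("packed data must be whole (count, value) pairs")
    out = bytearray()
    for i in range(0, len(packed), 2):
        count, value = packed[i], packed[i + 1]
        out += bytes([value]) * count
    return bytes(out)
```

This scheme shows all three costs named above in miniature: extra code, extra time, and a RAM buffer for the output. It also doubles the size of data with no repeated bytes, which is one reason real formats are more elaborate.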
Where To Now?
Well, that about does it for the Introduction. The rest of the SNES Homebrew Handbook is divided into Chapters, each one covering something different. Since this is an ongoing effort, new chapters may pop up every now and then with more advanced topics. Chapter 1 covers writing a very simple demo that loads an image to the screen. To do something even this simple requires a fair amount of code, as you'll see.